Speech Recognition System of Arabic Alphabet Based on a Telephony Arabic Corpus
نویسندگان
چکیده
Automatic recognition of spoken alphabets is one of the difficult tasks in the field of computer speech recognition. In this research, spoken Arabic alphabets are investigated from the speech recognition problem point of view. The system is designed to recognize an isolated whole-word speech. The Hidden Markov Model Toolkit (HTK) is used to implement the isolated word recognizer with phoneme based HMM models. In the training and testing phase of this system, isolated alphabets data sets are taken from the telephony Arabic speech corpus, SAAVB. This standard corpus was developed by KACST and it is classified as a noisy speech database. A hidden Markov model based speech recognition system was designed and tested with automatic Arabic alphabets recognition. Four different experiments were conducted on these subsets, the first three trained and tested by using each individual subset, the fourth one conducted on these three subsets collectively. The recognition system achieved 64.06% overall correct alphabets recognition using mixed training and testing subsets collectively.
منابع مشابه
Speech Recognition System of Arabic Digits based on A Telephony Arabic Corpus
Automatic recognition of spoken digits is one of the difficult tasks in the field of computer speech recognition. Spoken digits recognition process is required in many applications such as speech based telephone dialing, airline reservation, automatic directory to retrieve or send information, etc. These applications take numbers and alphabets as input. Arabic language is a Semitic language tha...
متن کاملUsing a Telephony Saudi Accented Arabic Corpus in Automatic Recognition of Spoken Arabic Digits
In this research, spoken Arabic digits are investigated from the speech recognition problem point of view. The system is designed to recognize an isolated whole-word speech. In the training and testing phase of this system, isolated digits data sets are taken from the telephony Arabic speech corpus, SAAVB. This standard corpus was developed by KACST and it is classified as a noisy speech databa...
متن کاملOff-line Arabic Handwritten Recognition Using a Novel Hybrid HMM-DNN Model
In order to facilitate the entry of data into the computer and its digitalization, automatic recognition of printed texts and manuscripts is one of the considerable aid to many applications. Research on automatic document recognition started decades ago with the recognition of isolated digits and letters, and today, due to advancements in machine learning methods, efforts are being made to iden...
متن کاملCollecting Data for Automatic Speech Recognition Systems in Dialectal Arabic Using Games with a Purpose
Building Automatic Speech Recognition (ASR) systems for spoken languages usually suffer from the problem of limited available transcriptions. Automatic Speech Recognition (ASR) systems require large speech corpora that contain speech and their corresponding transcriptions for training acoustic models. In this paper, we target the Egyptian dialectal Arabic. As other spoken languages, it is mainl...
متن کاملروشی جدید جهت استخراج موجودیتهای اسمی در عربی کلاسیک
In Natural Language Processing (NLP) studies, developing resources and tools makes a contribution to extension and effectiveness of researches in each language. In recent years, Arabic Named Entity Recognition (ANER) has been considered by NLP researchers due to a significant impact on improving other NLP tasks such as Machine translation, Information retrieval, question answering, query result...
متن کامل